Monotonically convergent optimization in quantum control using Krotov's method 
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The non-linear optimization method developed by Konnov and Krotov [Automation and Remote 
Control 60, 1427 (1999)] has been used previously to extend the capabilities of optimal control the- 
ory from the linear to the non-linear Schrodinger equation [Sklarz and Tannor, Phys. Rev. A 66, 
053619 (2002)]. Here we show that based on the Konnov-Krotov method, monotonically convergent 
algorithms are obtained for a large class of quantum control problems. It includes, in addition to non- 
linear equations of motion, control problems that are characterized by non-unitary time evolution, 
non-linear dependencies of the Hamiltonian on the control, time-dependent targets and optimization 
functionals that depend to higher than second order on the time-evolving states. We furthermore 
show that the non-linear (second order) contribution can be estimated either analytically or numer- 
ically, yielding readily applicable optimization algorithms. We demonstrate monotonic convergence 
for an optimization functional that is an eighth-degree polynomial in the states. For the 'standard' 
quantum control problem of a convex final-time functional, linear equations of motion and linear 
dependency of the Hamiltonian on the field, the second-order contribution is not required for mono- 
tonic convergence but can be used to speed up convergence. We demonstrate this by comparing the 
performance of first and second order algorithms for two examples. 
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I. INTRODUCTION 



Quantum control of light and matter uses external 
fields to engineer constructive and destructive interfer- 
ences to steer a physical process into a desired direc- 
tion 0, 0. The idea was pioneered in the 1980s 0-0 
and gained wide-spread attention with the advent of fem- 
tosecond lasers and pulse shaping techniques Q. It was 
realized at about the same time that constructive and de- 
structive interferences need not be devised by hand but 
can rather be obtained employing concepts from engi- 
neering such as feedback and optimization @-[H||- This 
has significantly broadened the range of quantum control 
problems that can be tackled. In particular, optimal con- 
trol theory (OCT) has become a popular tool employed 
in areas as different as vibrational dynamics of complex 
molecules (l2l . fl3l, q uantum dots and rings T3 . 
tracold gases [16h19i |. multi-photon excitations 
nuclear ma gne tic resonance |22l [23j and quantum infor- 
mation [2414261 ]. 

All these applications are based on quantum dynamics, 
but differ in their (i) equation of motion, (ii) dependence 
of the Hamiltonian on the control, and (iii) target func- 
tional, (i) In standard applications of OCT to quantum 
control, the equation of motion is the linear Schrodinger 
equation, e.g., Refs. @, [H H3, [H|. OCT can also be 
applied to examples with a non-linear equation of mo- 
tion [HMEil which is obtained as an effective descrip- 
tion in the framework of the mean-field approximation, 
(ii) The dependence of the Hamiltonian on the control is 
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linear in most applications of OCT from atomic, molec- 
ular and chemical physics, reflecting a laser field driv- 
ing optically allowed transitions. This is changed to a 
non- linear dependence if multi-photon excitations [20l[2l| 
or a parametrization of the control field [l9| is consid- 
ered, (iii) Linear and bilinear target functionals have 
commonly been used in application of OCT to quan- 
tum control to date [22|- However, target functionals 
that are higher order polynomials in the states of the 
system ma y b e encountered in quantum information ap- 
plications [281 ] . The type of dynamics and functionals 
translates into different requirements that must be met 
by the optimization algorithm. In particular for non- 
linear equations of motion, non-linear dependencies of 
the Hamiltonian on the control, time-dependent targets 
where the target operator is non-semidefinite and tar- 
get functionals that are higher order polynomials in the 
states, it is not straightforward to construct monotoni- 
cally convergent algorithms. This is the problem that we 
address here. 

We utilize the non-linear optimization algorithm by 
Konnov and Krotov [29[ which had been translated to 
the language of quantum mechanics and first employed 
for solving a quantum control problem by Sklarz and 
Tannor [16j. Specifically, Sklarz and Tannor realized 
that a generalized form of the optimization functional 
yielding modified adjoint states is required in order to 
apply OCT to a non- linear Schrodinger equation [l6j |. 
We show that the work by Sklarz and Tannor applies 
also to quantum control problems that are characterized 
by non-unitary time evolution, time-dependent targets, 
non-linear dependencies of the Hamiltonian on the con- 
trol and target functionals that depend to higher than 
second order on the time-evolving states. Translating 
the original proof by Konnov and Krotov [29[ to quan- 
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turn control, we furthermore show that the parameters 
of the optimization algorithm can be estimated analyt- 
ically or numerically. We can thus give a ready-to-use 
prescription of the algorithm where the parameters are 
determined by the physics of the problem. For the case 
of a quadratic functional and linear equations of motion, 
a first order construction of Krotov's method is sufficient 
to guarantee monotonic convergence [lfl [27| ■ We show 
that in this case, a second order contribution may still 
be used to speed up convergence. 

The paper is organized as follows. The optimization 
algorithm and the estimate of the algorithm's parameters 
are presented in Section [TT] with all mathematical details 
of the derivation found in the appendices. Sections ITO1 
and IIVI are devoted to applications of the optimization 
algorithm. We demonstrate monotonic convergence for 
one example that requires a second order construction in 
Section IIII1 and we present two examples in Section IIVI 
for which the second order construction is not required 
but may be used to speed up convergence. Section IVl 
concludes. 



II. OPTIMIZATION ALGORITHM 

A. Control problem 

The control problem is characterized by stating the 
control target and possible additional costs in functional 
form, 



J = J T [{ip k (T)}]+ ( J t [{<p k (t)},e(t)] dt, (1) 
Jo 

where we have separated final-time T and intermediate- 
time t 'costs'. {ipk{t)} denotes a set of complex state 
vectors [46| and e(i) is the external field or control that 
shall be optimized. The final-time cost is typically spec- 
ified in terms of some desired unitary operator O, for 
example (27j . 



J T [{(fk (T)}] = - A | Tr {&P N U(T, 0; e)P N } 



(2) 



N 2 



N 

(<Pk(t = 0)|6 + 0(T,0;e)K(t = 0)){<fk>(t = 0)|0(T, 0; e^6\ip k ,(t = 0)) . 



fc,fc'=i 



Here, U(T, 0; e) denotes the time evolution operator from 
the initial time t = to the final time T under the ac- 
tion of the field e, and Ao is a weight. The dimension 
of the subspace of the total Hilbert space, H, on which 
the target operator O acts, Ho, is denoted by TV, and 
Pat is the projector onto Ho- { | V^fc (£ = 0))} = {\k)} is 
a suitable orthonormal basis spanning this subspace. A 
single state-to-state transition is obtained by taking O 
to be the projector onto the target state, i.e. N = 1. In 
order to optimize one-qubit or two-qubit quantum gates, 
N = 2 and TV = 4, respectively. The functional is nor- 
malized by 1/N 2 such that the optimum corresponds to 
the weight Jt = — Ao- 

The functional of Eq. @ is quadratic in the states 
at final time {tpk(T)}. A linear dependence is obtained 
by taking the real part instead of the square modulus 
of the trace and yields a phase-sensitive functional [27j ■ 
Generally, expressing the functionals in terms of expec- 
tation values yields at most a quadratic dependence on 
the states. A functional that is a higher order polynomial 
in the states is obtained in the context of quantum infor- 
mation when optimizing for a certain degree of entangle- 
ment rather than a specific unitary transformation [28| . 
see Section UTTl below. 



The intermediate-time cost Jt, 

J t [{Mt)},e] = g[{\<Pk{t))},<t),t] 

= g a [e(t),t]+g b [{cp k (t)},t] , (3) 

is typically used to minimize the field intensity and to 
switch the field smoothly on and off, 

9aHt)] = W) [ £ (*)- e ~f(i)] 2 , (4) 

where e re f(t) denotes some reference field, S(t) is a pos- 
itive (shape) function and A a a weight. Jt can also be 
used to formulate time-dependent targets [30l - l32j or con- 
straints that depend on the state of the system at inter- 
mediate times [33| such as 

\ n 

9b [Wm] = 5> fe (*)|D(t)K(t)) , (5) 

k=l 

where the dependence on the states again is quadratic. 
While complicated dependencies of g a [e] and g b [{<^fc}] 
are conceivable, we require g a [e] and g b [{<Pk}) to be ad- 
ditive, cf. Eq. ((3]). This assumption is typically justified 
since costs or penalties involving the field are usually not 
related to costs concerning the dynamics of the system. 
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The time evolution operator required to evaluate the 
functional, Eq. @, can be obtained by solving the equa- 
tion of motion for each of the basis states, 



dt 



\<p h {t)) = -jfibkAMt)) > 

= |/ fc (vJfc,e)) , k = l,...,N. 



(6) 



An explicit dependency of the Hamiltonian on the state, 
H[iy9fc], will occur for non-linear Schrodingcr equations 
such as the Gross-Pitaevski equation or Hartree-Fock- 
like equations where the Hamiltonian is second order in 
the states [H H H Hi . The dependency of the Hamil- 
tonian on the field can be linear, corresponding to one- 
photon dipole coupling, or higher order for non-resonant 
multi-photon transitions. Equation © can be extended 
to account for non-unitary time evolution by considering 
the density operator to be a vector in Liouville _space and 
replacing the Hamiltonian by the Liouvillian [351 ] . 



It is sufficient to expand the functional 3>({| t) up to 
second order in the states to ensure the necessary mono- 
tonicity conditions for arbitrary change of the state vec- 
tors, 



(xk(t)\Mt)) + (Mt)\xk(t)} 



-lj2( A Mt)\Hm<pi(t)), 



(8) 



kl 



B. Optimization equations 



The essence of Krotov's method [T|| consists in dis- 
entangling the intcrdcpcndcncy of the states and the con- 
trol by constructing an auxiliary functional L[{(pk}, e, $] 
that depends on the states, the control and an arbitrary 
scalar potential $ such that for any <£>, L[{ipk}, e, $] and 
J[{(pk\, e] are identical and minimization of L[{cp k }, e, $] 
is completely equivalent to minimization of J[{y>fc}, e]. 
This is achieved by adding a vanishing quantity, 

L[{<p k },e,$] = G({MT)})-HWk(0)},0) 

- f R({ip k (t)},e(t),t)dt (7) 
Jo 

with 



with first order expansion coefficients Xk{t) and Aip k (t) 
the change in the state <fik(t), given by A(p k (t) = 
Vfc 1 \fy ~~ fk\t) where the superscripts (i) and (i + 1) 
denote propagation under the old and new fields, eW (t) 
and e^ +1 ^(t), respectively. Due to the freedom of choice 
in <!>, the operator 6r{t) in the second order term can be 
chosen to ensure the extremum condition with respect to 
the state changes. If J is to be minimized, a(t) will be 
chosen to maximize it with respect to the change in the 
states. Any chan ge i n the field leads then to monotonic 
decrease of J [liH |29| . 



G[Wk(T)}] 
R[{tp k (t)},e(t),t] 



J T [{<p k (T)}} + m<p k (T)},T), 
-(g a [e(t),t}+g b [{<p k (t)},t}) 



N 



k=l 



fc=i 



Requiring the first order derivatives of L with respect 
to \(p k (t)), ((p k (t)\ and e(t) to vanish, yields a set of cou- 
pled control equations for the first order expansion coef- 
ficients Xk(t) and the field. Since all involved functionals 
are real, it is sufficient to state the equation for the kets, 
the bra equations being given by the adjoints, 



|lxP(*)> 



= 4A^«e«]|x«>-i 

+Vtou\9b\\ v M (t ))> 



~ V (v> k \ J T\\ v f(T)) ' k = l,...,N. 



(9a) 
(9b) 



The 'initial' condition at the final time T is given in terms of the gradient with respect to the states of the final-time 
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cost, Jt- Note that all gradients in Eqs. © arc evaluated prescription for the new field is obtained by evaluating 
with the states <p^ that arc forward propagated under the derivative of the constraints with respect to the field, 
the old field, eW as indicated by the superscript (i). The 



de 



= 23m 



' N 

E 



xf{t) 



dH 

de 



1 N l 



k=l 



dH 



de 



(*) 



(10) 



It involves backward propagation of the adjoint states 
under the old field, xi an d forward propagation of 
the states, ipu +1 \t), under the new field, 

- -^H[^ +1) , £ (l+1) ]|^ +1) W)(Ha) 
l^ I+1) (0)) = |fc), fc = l,...,JV. (lib) 
Equations ((9l)- (fTTj) need to be solved simultaneously. 
I 



The iteration is started by propagating Eqs. (fTT|) under 

some guess field, e(°\t), to obtain tp^'(T) and evaluate 
Eq. (|9b[l . The equation for the backward propagation, 
Eq. (|9a|) . becomes an inhomogeneous Schrodinger equa- 
tion for non-linear equations of motion, cf. the terms in 
square brackets, or if the intermediate-time cost, de- 
pends on the states (V^ k \gb ^ 0). If the time-dependent 
cost over the field takes the form of Eq. Q , the equation 
for the new field reads 



(*) 



£ref(*) 
S(t) 



(12) 



/v 



3n> EUi l) W 



„k=i 



dH 

~dl 



1 N I 



*!=i 



<9H 



0>c: 



Moreover, for dipolc transitions the Hamiltonian is given 
by H = Ho + A e (0; an( i hence <9H/<9e = p,. We 
thus recover the familiar prescription for the change in 
field obtained for a first order construction of the algo- 
rithm [I?} plus an additional second order contribution, 
given in terms of the change in the states, Atp^ +1 \t), 
with 'weight' <r(t). 

A choice of a(t) that guarantees a maximum of L with 
respect to the states (i.e., a positive second derivative of 
R and negative second derivative of G) is given by [2!| 

ait) = e*(T-t)(C-A)-Z for B * ,(13a) 

a(t) = C(T — t) — A for B = . (13b) 

The physics of the problem, i.e., the dependency of the 
functional on the states, the dependency of the Hamil- 
tonian on the control, (non-)linearity of the equation of 
motion governing and unitary or non-unitary of the time 
evolution, determine the parameters A, B and G, 

A = max(£ J 4,2A + e j4 ) , (14a) 
B = 2B + e B , (14b) 
C = min(-e c ,2C-£ C ) , (14c) 



where the e$ are non-negative numbers that can be used 
to enforce strict inequality [47j . It should be emphasized, 
that only in cases where A, B and G all turn out to 
be zero, the linear version of Krotov's method [27( is 
sufficient to guarantee monotonic convergence. 



C. Parameters of the second order contribution 

For quantum control problems the parameters 
A, B and G can either be estimated analyt- 
ically or approximated numerically by considering 
AG = G({^ +1) (T)}) - G({^\T)}) and AR = 
i?({^ +1) (t)}, c W(t), t) - R({^ (t)}, e« (t), t) and guar- 
anteeing their negativity and positivity, respectively. The 
analytical estimate is based on a worst case scenario and 
strictly guarantees monotonic convergence. The worst 
case scenario may, however, not always be required and 
a more efficient while less fail-proof approach is given by 
numerical approximation of A, B and C . 

The analytical estimate of A is obtained by requiring 
AG < 0. This is guaranteed if 
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A = sup 



Ef=i liXk(T) | A^ fc (T)) + (A^(T) | »(T)>] + J T (T) + A^(T)}) - J T (rf(T)} 



Ef=i[(A^(T)|A^(T)>] 



(15) 



where the supremum needs to be taken over all sets 
of possible state change vectors {Aip k (T)} with norm 
EfeLi (Ay>fc(T) I Aifik(T)) between zero and 2N. Konnov 
and Krotov proved [29( that, under certain conditions 
which are almost trivially fulfilled for quantum systems, 
quantities like the one on the right-hand side of Eq. (|15[) 
and similar ones obtained for B and C exist and are well- 
defined, see also Appendix [X] We discuss in Appendix 151 
how Konnov's and Krotov's proof simplifies for quantum 
systems such that the supremum in Eq. (fT5"j) can be es- 
timated. Specifically, we show in Appendix [C] that the 
argument of the supremum in Eq. (|15[) can be rewritten 
in terms of a Taylor series of Jt that starts at the sec- 
ond order. The series can be estimated by its Lagrange 
remainder, 



1 



A < - sup d a J T ({A<p k (T)}) 
1 {A ¥ > )! };l"l=2 



(16) 



i.e., A is given by the supremum over the second deriva- 
tives of the final-time functional, Jt , with respect to the 
states, ipk(T). The multi- index a and derivative d a are 
defined in Eqs. (|C2[) and (|CJ3|) . respectively. For function- 
al Jt that are linear or convex in ip k (T), i.e. those for 
which the second derivatives vanish or are always non- 
positive, A < 0. For simplicity one can then choose 
A = 0, e A = such that ^4 = 0. 



The analytical estimates of B and C are obtained by 
requiring AR > where 



N 



AR({Av(t)},t) = ^[(A^(t)|A^(t))] 



fc=i 



Ef=i [<**(*) I A%(*)> + (Ap fc W | * fc (t)> + (Xk(t) | A/ fc (i)) + (Af k (t) | Xfc (t)>] - A. 9 



-&{t)+a(t) 



ELi [(Ay fc (t) | A/ fc (f)) + (Af k (t) | A^(t))] 
Ef =1 [(A^(t)|A^(i))] 



Ef=i [(A^(i)|A^(t))] 



.(17) 



The second summand in the square brackets determines 
B and the third one C . Af k and Ag describe the change, 
due to changes in the states, in the equations of motion, 

\Af k (Acp k ,t)) = \f h (tp®{t)+A<p h (t),e®(t),t) 

HAtei°(t),e«(t),t)>, (18) 



and in the constraint, 



A 5 ({A Vfc },t) = fl({^(t) + A Vfc (t)},e (<) (t),t) 



(19) 



B is given by 



B 



sup 

{A Vfc } ; te[o,T] 



Etx [(A^W I A/ fc (i)> + <A/ fc (t) | Ap fc (t)>] 



Ef =1 [(A<p k {t)\A<p k (t))] 



(20) 



and can be rewritten in terms of a supremum over the pendix [U] Estimating the Taylor series by its Lagrange 
Taylor expansion of the Hamiltonian starting at first remainder, we obtain 
order and a supremum over the Hamiltonian, cf. Ap- 
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B < 2 



TV sup 

te[o,T] 



d a H (A<p k (t)) 



2 sup 

{A Vk };te[0,T] 



Ek=i3 m ( A Mt) H(A^(t),eW J t) Av> fc (t) 



. (21) 



i.e., i? can be estimated by the supremum over the first 
derivative of the Hamiltonian with respect to the states 
ip(t) and a term which can be interpreted as twice the 
maximum absolute value of the imaginary part of the 
Hamiltonian's eigenvalues. For unitary time evolution 
governed by the standard (linear) Schrodingcr equation, 
B = can easily be proven and in this case a(t) is a 



linear function of time. It is also possible to take B = 
for non-unitary time evolution with linear equations of 
motion provided A and C can taken to be zero. For non- 
linear equations of motion, the supremum of the first or- 
der derivatives of the Hamiltonian needs to be evaluated 
explicitly C is found to be 



Eti [<**(*) | Ap k (t)) + (AMt) I **(*)) + (Xk(t) I Af k (t)) + (Af k (t) | X k(t))] - Ag 



C 



inf 

{A<p k };tG[0,T] 



EtA(^ k (t)\Acp k (t))] 



(22) 



As shown in Appendix [Cj it can be rewritten in terms of suprema of the Taylor series of the Hamiltonian and the 
constraint, g, starting at first and second order, respectively. Estimating the Taylor series by their Lagrange remainder, 
we obtain 



A' 



C > 2J2 



k=l 



X f(t) X f(t))- sup 

' A Vk ;t£[0,T] 



d a H(Aip k ,t) 



\a\=l 



sup [d Q 5 ({A<M,i)] , 

{Ayj fc };te[0,T] 
\a\=2 



(23) 



i.e., C is given by the supremum of the first order deriva- 
tives of the Hamiltonian multiplied by the norm of the 
costates \k (t) and the supremum of the first order deriva- 
tives of the constraint, g. For linear equations of mo- 
tion and gb zero or linear in ip k (t), we find C = 0. The 
case of a quadratic dependency of g on the states ip k , 
cf. Eq. ([5]), can also easily be handled. The second term 
in the right-hand side of Eq. (|23[) can then be estimated 
by the eigenvalue of the operator D(t) with largest mag- 
nitude. For example if D(t) is the projector onto some 



subspacc, D(i) = P, then 



C < - maxfPl 

- NT ev 1 J 



— ■ (24) 
NT 



In order to derive a numerical approximation for the 
parameters A, B and C, we assume a finite time grid, 
{tj}, j = 1, . . . , n, and define, based on Eqs. (TT5j) . (|2T))) 

and (22), parameters A^ i+1 \ B ( j +1) and cf +1 \ 



A^){{A<p k }) 
and 



££=i liXk (T) | A^ fc (T)) + (Aip k (T) | Xk (T))] + J T (rf(T) + A^(T)}) - J T ({^ } (T)) 



Ef =1 [(A^ (T)|A^ (T))] 



B (l+1) ({A^}) = Ef=i [<Ay fc fa) | A/ fc (A^.,^-)> + (A/ fe (Aip k ,tj) \ A<p k (tj)) 



C? +1 \{A<p h }) = , 



Efc=i I(A^ (*;) I A<p k (tj))] 



E fe =i [( A V5fe fe)|A^ fe)>] 

r A 7 

E [ (X* I A W= (*i)> + (A^fe (tj) I Xfe (tj)) + 



fe=i 



(Xk fo) I A/ fc (A Vfc , tj)) + (A/fe (Ap fc) tj) | Xfc (tj)) ] - A 5 ({A^}, tj) 
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At iteration step + a numerical estimate for the sec- 
ond order parameter. (t), is obtained by replacing 
A, B, and C in Eqs. (0 by A^ i+1 \ Eq. ([25]), together 
with 



B^ = supi?f +1) 



= infcf +1) 



(26a) 
(26b) 



repeated with the values of A^ t+1 \ Bf +1) and cf +1) that 
were obtained during the failed step. Numerical estima- 
tion of the second order parameters enforces monotonic 
convergence with respect to a change in the states as gen- 
tly as possible, making use of the optimization history to 
find a compromise between monotonicity and speed of 
the convergence. 



and inserting the resulting numerical estimates of A, B, 
and C into Eqs. JT2J). Since the new states ip^ +1 \t) — 



iff (t) + Aip(t) are not known, A^ l+1 \ Bf +1) and cf +1) 
need to be approximated, for example by the values of 
A^\ Bf and Cf calculated in the previous iteration. 
In rare cases, this approximation might lead to loss of 

monotonic convergence. The iteration then needs to be lates into tstt < 0, or, 



D. Monotonic convergence for arbitrary 
dependence of the Hamiltonian on the control 

In any iterative optimization, convergence with respect 
to the field can only be expected towards a local ex- 
tremum. Here, the local extrcmum condition on J trans- 

d 2 R 



J 



d 2 g a 



de 2 




^ +1) w) + E(^ +1) w 



d 2 H' 



de 2 



9 2 H f 



de 2 







e (<+i), v C*+i) 





For a linear dependence of the Hamiltonian on the con- the weight A a for the typical quadratic dependence of g a 



trol, 



a 2 H _ d z H< 

Fie* 



Or 2 



simply > 0. This translates into the sign of 



0, and a maximum in R requires on e > cf - Ec l- ©■ Inserting the corresponding derivative 

of g a , we obtain 



(*) 



S(t)„ 




Ik 



: (i+l).u,Ci+l) 



JY 





<9H 




de 









r 



For a non-linear dependency of the Hamiltonian on the 
control, we define the change in the intermediate-time 
functional due to changes in the control, 

Kit) = R({^ +1) (t)},e^ +1 Ht),t) 

-R({^ +1 \t)},e^(t),t). (29) 



The strict maximum condition for R becomes A e (f) > 
V£. We assume J t , cf. Eq. (O, to be additive. Rewrit- 
ing H[^ l+1) ,e( 4+1 M - H[<p£ +1) ,eM,t] in terms of the 
Taylor expansion of the Hamiltonian with respect to the 



control, we find the zeroth order term to vanish. The first 
order derivative can be rewritten using Eq. (|10j) . The re- 
maining terms correspond to the Taylor series starting 
at second order which can be estimated by its Lagrange 
remainder, 



M|(f) 



sup 



de 2 



H(yjfe,e,t) 



(30) 



Employing furthermore Ylk=i V (A<Pk(t) \ Atp k (t)) < 



2V N, we obtain 
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NA'K 



W E \/ I Xk°(*)) + k(*)I^M a e (t)j (e( i+1 )(t)-eW(t)) 2 , 
fc=i ' J 



r 



The Lagrange remainder can be evaluated by taking the 
second derivative of the Hamiltonian with respect to the 
field and estimating the norm of the resulting operator 
by its spectral radius or its eigenvalue with largest square 



modulus. Since it is difficult to proceed without a spe- 
cific dependence of g a on e, we assume a quadratic de- 
pendence, cf. Eq. (j4|), and find 



A e (t) > 



Aq 

S(t) 



fc=i L 



Af 2 e (t) +iY|a(t)|M 2 e (i) 



( e «+D(t)- e W(t)) a . (31) 



Monotonic convergence is ensured by adjusting the shape function S(t) and the parameter A a such that 

N 



S(t) 



> 



k=l 



xiHt) x]«(t) 



M 2 e (i) + iV|cr(i)|M|(i) 



(32) 



Unlike the algorithm of Ref. [2l| , the numerical effort in 
our approach to ensure monotonic convergence is inde- 
pendent of the order of the non-linearity of the Hamilto- 
nian's dependence on the control. 



To summarize section [Til a second order construction 
of the optimization algorithm yields an additional contri- 
bution to the equation for the new field. Compared to the 
linear variant of Krotov's method (27j . it simply requires 
additional storage of the states from the previous itera- 
tion to determine Aipk (t) and calculation of the function 
a(t). The parameters A, B and C determining a(t) turn 
out to be zero, i.e., the second order contribution van- 
ishes, for the 'standard' quantum control problem with 
bilinear final-time cost, intermediate-time costs that are 
independent of the states and linear equations of motion. 
A second order contribution can nevertheless be invoked 
to study its influence on convergence. This is investigated 
below in Section [IV Bl A second order contribution is re- 
quired to guarantee monotonic convergence for final-time 
costs that are a polynomial of higher than second order 
in the states [28|, demonstrated below in Section ITO1 and 
for general intermediate-time costs that depend on the 
states, studied below in Section IIV CI A second order 
construction is also required for non-unitary time evolu- 
tion and for non-linear equations of motion [l6l . |3~H . 



III. APPLICATION I: HIGHER ORDER 
POLYNOMIAL COSTS 

A functional that is an eighth order polynomial in the 
states arises in quantum information when optimizing for 
a local equivalence class of two-qubit gates, [O], rather 
than a specific two-qubit gate, 6 [28| . The functional is 
based on the local invariants of two-qubit gates [36| which 
uniquely characterise a local equivalence class. The ex- 
plicit, somewhat lengthy expression of J^ 1 is given in 
Ref. [28j]. 

We employ J^ 1 to optimize for the local equivalence 
class of the B-gate [37| , given in the logical basis by 



>b = e< 



/cos(tt/8) isin(7r/8)> 

sin(7r/8) icos(7r/8) 

icos(tt/8) sin(7r/8) 

Vsin(7r/8) cos(tt/8) 



for an effective spin-spin system 



H 



eff 



8 



3 

E 

i,3=0 



(33) 



The effective Hamiltonian is derived, within second-order 
perturbation theory, for trapped polar molecules with 
2 £i/2 electronic ground states, subject to near-resonant 
microwave driving that induces strong dipole-dipole cou- 
pling 38(. f2(£) denotes the Rabi frequency that will 
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FIG. 1: (color online) Convergence of the local invariants 
functional that optimizes for a local equivalence class rather 
than a specific operator and is an eighth order polynomial in 
the states. The optimum corresponds to J^ 1 = 0. 



be optimized, <x; are the 2x2 Pauli spin matrices, 
i = 1,2,3 = x,y,z with cto = B-2- The tensor ay de- 
pends on the distance xq between the molecules and on 
the polarization and detuning of the microwave field. We 
consider here CaCl molecules in an optical lattice with a 
lattice spacing of 300 nm, microwave radiation of about 
9.13 GHz, polarizations a± = 1/V2, cto = 0, and a de- 
tuning from the rotational transition of 1.2 kHz. 

The higher than quadratic dependence of J^ 1 on the 
states leads to a non-zero analytical estimate of A, cf. 
Eq. (Unj) - Specifically, one has to calculate all second 
derivatives of the eighth order polynomial and evaluate 
the supremum by considering an arbitrary change in the 
states up to \\Aip\\ = || J2 k < 2-//V, cf. Appcndix|B] 

(Fig. IB]). For our example, N = 4 and from the second 
derivatives, we obtain A = 45 although in practice a 
value of A = 5 was sufficient to preserve monotonic con- 
vergence. Figure [1] demonstrates that this choice of A 
(blue dashed line) indeed ensures monotonic convergence 
independent of other parameters of the algorithm such as 
the weight A a , cf. Eq. (fj|. Note that in this example con- 
vergence of the final-time functional Jt and the complete 
functional J are equivalent due to our specific choice of 
9a [13_and g b — 0. The latter yields C = 0. Further- 
more 5 = since our equation of motion is linear in 
the states. The first order algorithm (solid red line) fails 
completely for small A a and violates monotonic conver- 
gence for many iteration steps albeit finding an optimum 
eventually for intermediate A a . The weight A a determines 
the step size for changes in the field, cf. Eq. (fT2")l : small 
A a leads to large values of the field and thus a possibly 
more 'erratic' optimization, while for large A a , optimiza- 
tion proceeds more cautiously, explaining that even the 
first order construction is found to converge monotoni- 
cally. This is, however, due to the simplicity of our con- 
trol problem which is essentially one-dimensional since 
success is determined by the integrated field amplitude. 



For more complex Hamiltonians, optimization emplo yin g 
a first order construction was found to always fail [28| . 
underlying the significance of the second order construc- 
tion. The analytical estimate of the second order param- 
eter takes all worst-case scenarios into account. There- 
fore values smaller than the analytical estimate might 
already be sufficient for monotonicity which is confirmed 
as shown in Fig. [T] by the blue dashed line representing 
A = 5 and the black dotted line representing A = 1 even 
though the analytical estimate is given by A = 45. Such 
a less conservative estimate yields, moreover, signficantly 
faster convergence for small A a , a fact that was also con- 
firmed for more complex Hamiltonians J28|. 



IV. APPLICATION II: BILINEAR COSTS 

For bilinear costs, the supremum estimation of C yields 
zero. We consider the dynamics of our examples to be de- 
scribed by the standard Schrodinger equation such that 
B = and B can also be chosen zero. A second-order 
construction of the optimization algorithm becomes nec- 
essary if bilinear intermediate-time costs are employed, 
gb 0, which lead to non-zero C, cf. Eq. (f2U|) . If 
gb = 0, a second-order construction is not required to 
ensure monotonicity but can be utilized to speed up con- 
vergence. 



A. Model 

A simplified model for the vibrations in an Rb2 
molecule that linearly interacts with a laser field takes 
three electronic states, A 1 S+(5s + 5s), 1 S+(5s + 5p) and 
1 n g (5s + 4d), into account |33l. l39l|. 

3 

H = ^H J ®|e l )(e l |+/ie(i)- (|ei)(e 2 | 

i=l 

+ |e 2 )(e 1 | + |e 2 )(e 3 | + |e3)(e 2 |). (34) 

Here, denotes the vibrational Hamiltonians, = 
T + Vi(R), with kinetic and potential operators T and 
Vi(R), respectively, ji is the transition dipole operator, 
assumed to be independent of R, and e(t) the electric 
field. The potentials are found in Ref. [40|. The vibra- 
tional Hamiltonians are represented on a Fourier grid (4l| 
with Nr grid points yielding a total Hilbcrt space dimen- 
sion of M = 3Ar. The equations of motion are solved 
by the Chebychev propagator for homogeneous and inho- 
mogeneous Schrodinger equations, respectively (39L l4~D|. 
An initial field of the form 



,(0) 



(t) = e s{t) cos(fii) 



(35) 



is employed with eo the maximum amplitude and Q the 
central frequency of the field. The shape function s(t) 
is taken to be s(t) — sin 2 (7rt/T), where T corresponds 
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bution to the change in the field is closely related to the 
gradient of the functional. [48| Since the gradient van- 
ishes close to the optimum, convergence of the first order 
construction slows down as the optimum is approached. 
Variation of the non- negative number, sa, shows that an 
optimal choice of £a exists. However, this optimal choice 
cannot be determined a priori. In terms of convergence 
speed close to the optimum, it is therefore rccommend- 
able to employ the numerical estimate of A (red solid 
line in Fig. [51). Very similar behavior is found for opti- 
mization of a unitary transformation, a Hadamard gate 
(N = 2) on the lowest two vibrational eigenstates of the 
electronic ground state (data not shown). 



FIG. 2: (color online) Convergence of the first order and sec- 
ond order constructions of the optimization algorithm as mea- 
sured by the final-time objective, Jt, for state-to-state trans- 
fer from vibrational level v — 10 to v — 0. 



to the optimization time. The weight of the final-time 
objective, Ao, is taken to be one such that the optimum 
corresponds to — Jt = 1. 



B. State-independent intermediate-time cost 

(9b = 0) 

We investigate optimization of a state-to-state transfer 
(N = 1), taking for simplicity only the electronic states 
X 1 E+(5s+5s) and 1 £+(5s+5p) into account. The initial 
and target states are taken to be the vibrational eigen- 
states v = 10 and v = of the X 1 S+(5s + 5s) electronic 
ground state. With a vibrational period of the initial 
state of 614 fs, the optimization time is set to T = lps. 
The central frequency and maximum field amplitude are 
taken to be f2 = w v =io^ru'=o an d erj = 1 ■ 10 -2 a.u. 

Convergence of the final-time objective Jt is shown in 
Fig. [3J comparing first order (black circles) and second 
order constructions of the algorithm. The second order 
construction is determined by the choice of A which can 
be taken to be equal to some non- negative number, sa, 
cf. Eq. JTJ} (dotted and dashed lines in Fig. or A = 
(first order optimization, black circles in Fig. [2]) . The nu- 
merical estimate of A, cf. Eq. ([25]) . is represented by the 
solid red line in Fig. [5] The latter choice might speed up 
convergence, but is more risky: Since A = 2A^ l+1 ^ (Atp) 
can become negative, the condition for monotonic conver- 
gence may be violated. This is clearly seen in Fig. [2j In 
the lower inset, monotonic convergence is lost for one step 
after the first iteration step. We find in this case, that the 
state change is almost maximal, \\Aip\\ = 1.95 < 2, i.e., 
the worst possible case that the optimization algorithm 
must deal with is reached. While the first order construc- 
tion converges faster initially, the upper inset shows that 
all second order constructions supersede the first order 
one as the optimum is approached. This is readily un- 
derstood by inspection of Eq. (TT2"j) : The first order contri- 



C. State-dependent intermediate-time cost (gb 7^ 0) 

State-to-state transfer from v = to v = 1 and the 
Hadamard gate on the lowest two vibrational eigenstates 
of the electronic ground state, 

are optimized, taking into account an additional state- 
dependent cost, gb (f,t). If both a state-dependent cost 
and a final-time target are present, the algorithm seeks 
to optimize a compromise between the two goals. The 
parameters Ao and A& determine the relative weight of 
each target. Monotonic convergence always refers to the 
value that the total functional, J of Eq. (TT]), takes; and 
each separate contribution to J does not need to converge 
monotonically. Below we will discuss convergence of both 
J and Jt- In order to render the optimal value of J 
independent of the choice of the weights Ao, Af,, we define 
a normalized functional, 

Jnorm = T T - j A& < , (36a) 

A& — Ao 

Jnorm = 1 T~ ^ , A 6 > , (36b) 

that converges toward one. 

The cost gb is employed to avoid any population trans- 
fer to a forbidden subspace, taken to be the 1 n g (5.s + 4<i) 
state, at all times t <G [0,T] [13 • This can be expressed 
by taking the operator D in Eq.([5j) to be one of the two 
choices, 

D = P a iiow = |ei)(ei| + |e 2 )(e 2 |, A b < (37a) 
D = Pforbid = |e 3 )(e 3 |, \ b > 0, (37b) 

where P a iiow and Pf or bid denote the projectors onto the 
allowed and forbidden subspaccs, respectively The al- 
lowed subspace is formed by the X 1 S+(5s + 5s) and 
1 S+(5s + 5p) states. The different signs of the weight 
Xb indicate maximization of P a iiow and minimization of 
Pforbid which is physically equivalent. Mathematically, 



the equivalence does not hold since gb is a constraint and 
therefore must have its sign opposite to that of Jt- This 
is possible only for the choice of Eq. (|37a|) . Note that the 
'wrong' sign of Eq. (|37b[) exemplifies the more general 
class of indefinite operators D(i) for which it is not pos- 
sible to construct a monotonically convergent algorithm 
using the previously available tools by a simple change 
of sign. 

With the choice of gb according to Eqs. (f3"T|). the an- 
alytical estimate of the parameter C of the second or- 
der contribution is given by Eq. (|24p. Writing explicitly 
the change in the intermediate-time contribution to the 
functional due to the change in the states for a first order 
algorithm, 



1 N 

- Xb Wr ^(AvfcWIDIA^i)). 



NT 

fe=i 

we find the necessary condition for monotonic conver- 
gence, A v ( t ) > 0, to be always fulfilled for D = P a iiow A 
second order construction is therefore not required [33|, 
corresponding to C = in accordance with Eq. In 
this case, C can be set to —ec, cf. Eq. (|T4| . where ec 
is a non-negative number to check for improved conver- 
gence with a second order algorithm. However, if the 
projector onto the forbidden subspace is employed in gb, 
A v ( t ) > is not necessarily fulfilled and a second or- 
der construction is required to ensure monotonic conver- 
gence. Eq. (|24p now yields a large negative number for 
C since Xb is negative, and C is determined by C. Our 
approach goes beyond the results of Ref. |33| where a 
convergent (first order) algorithm was obtained only for 
negative semi-definite operators At>D(i) in the additional 
constraint gb- The second order construction of Eq. (|2"5)l 
allows for a larger class of operators D(t) in the state- 
dependent constraint gb- 

The final time is set to T = 2 ps, the central frequency 
of the guess field is chosen to be f2 = oj v =o^-v'=w and 
eq = 2 • 10~ 4 a.u. A is taken to be zero since our emphasis 
is on the choice of the parameter C. 

In analogy to the previous subsection, we first study 
the case D = P a iiow where the second order construction 
is not required but may improve convergence. Figure [3] 
compares the convergence of first and second order con- 
structions. Taking C to be equal to — ec, cf. Eq. ([14")) 
(dotted and dashed lines in Fig. [3]) , does not affect mono- 
tonicity. However, it also does not yield faster conver- 
gence than the first order algorithm (black circles). The 
numerical estimate, C = 2C^ +1 ^(A(^) (solid red line in 
Fig- 13) cf. Eq. (|26b|) . neglecting ec in Eq. (|T4")l is some- 
what risky since C^ l+1 ^ (Aip) can become positive such 
that this choice of C does not guarantee monotonic con- 
vergence. Indeed, small violations of monotonicity, for 
example between steps 1180 or 1300, are observed in 
Fig. [3] However, this is more than compensated for by 
the improved speed of convergence as compared to the 
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FIG. 3: (color online) Convergence of the first order and sec- 
ond order constructions for a Hadamard gate with a state- 
dependent cost (D = Paiiow, i-e. the second order construc- 
tion is not required). 
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FIG. 4: (color online) Convergence of the first order and sec- 
ond order constructions for state-to-state transfer with state- 
dependent cost. The operator D is taken to be the projector 
onto a forbidden subspace, i.e. the second order construction 
is required. 



first order and the conservative choices C = —ec- Very 
similar behavior is found for optimization of a state-to- 
state transition (data not shown). 

For D = P forbid) monotonic convergence needs to be 
ensured by a second order construction with C given by 
Eq. (|24]) . This choice corresponds to the green dashed 
line in Fig. U which studies convergence of the final-time 
objective, Jt, and the complete functional, J n orm, for a 
state-to-state optimization. Taking C somewhat smaller 
than the estimate of Eq. (|24[) may yield non-monotonic 
behavior, cf. the blue dotted line in Fig. @] If one ne- 
glects the second order contribution and the weight Xb 
is large, the algorithm completely fails (orange diamonds 
in Fig. [4]). For a small weight Xb, the algorithm con- 
verges to an optimum but non-monotonic behavior is ob- 
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FIG. 5: (color online) Convergence of second order construc- 
tions for a Hadamard gate with a state-dependent cost. The 
operator D is taken to be the projector onto a forbidden sub- 
space, i.e. the second order construction is required. 



served at intermediate iterations (black circles). Note 
that for small Ab, the constraint gt is almost not enforced 
due to insufficient weight. The best compromise between 
monotonic convergence and high fidelity is obtained for 
C = 2C {l+1 \k(p) (red solid line in Fig. SJ). For the pa- 
rameters for which the complete functional, J n0 rm, con- 
verges monotonically (green dashed and solid red lines 
in Fig. 3]), monotonic behavior is observed also for the 
final-time objective, Jt- In general, this need not be the 
case. We attribute it in the current example to our choice 
of the guess field which is relatively weak such that the 
forbidden subspace is not strongly populated. The al- 
gorithm therefore starts out in the 'right' direction for 
optimizing both targets, Jt and gb, and it does not need 
to optimize one target at the expense of the other. 

Fig. [5] presents convergence of the final-time objective, 
Jt, and the complete functional, J„ rm, for optimization 
of the Hadamard gate. Similarly to Fig.|H convergence is 
almost identical for the analytical estimate of C based on 
Eq. (|24p (green dashed line in Fig. [4]) and the numerical 
estimate according to Eq. (|26b[) (red solid line). A small 
violation of the analytical estimate (blue dotted line in 
Fig. [5]) leads to non-monotonic behavior but may yield 
larger fidelities after many iterations. 

To summarize our numerical investigations, a second 
order contribution can be employed to enforce monotonic 
convergence for functionals that are higher-order polyno- 
mials in the states or correspond to expectation values of 
non-semidefinite operators. The numerical estimate of 
the second order paramters might slightly violate mono- 
tonicity but yields the highest fidelities, especially as the 
optimum is approached. If a second order contribution 
is not required by the functional, it may nevertheless be 
used to improve convergence. Also in this case the nu- 
merical estimate of the second order parameters A and 
C turns out to be most efficient. 



V. SUMMARY AND CONCLUSIONS 

Applying Krotov's method [29| to quantum control, 
we have shown that monotonically convergent optimiza- 
tion algorithms are obtained for any quantum control 
problem provided that a second order construction is em- 
ployed. The equation for the optimized field then con- 
tains an additional term. Compared to a first order al- 
gorithm [I?}, only storage of the quantum states of the 
previous iteration and calculation of the second order 
weight are additionally required. We have shown that 
the parameters for the second order contribution can be 
estimated analytically based on the final-time target and 
intermediate-time 'costs', the equations of motion and 
the dependence of the Hamiltonian on the control field 
or calculated numerically from the optimization history. 
This is due to the normalization of quantum state vectors 
and finiteness of physical control fields, implying that op- 
timization is performed over compact sets of candidate 
states and controls, which has allowed us to significa ntly 
relax the conditions for Krotov's constructive proof [29| . 

We have illustrated the power of our approach by ap- 
plying it to two control problems for which no monoton- 
ically convergent algorithm existed - to target function- 
als that are higher-order polynomials in the states and 
to state-dependent constraints expressed as expectation 
value of a non-semidefinite operator. Target function- 
als that are higher order polynomials in the states arise 
for optimization towards an equivalence class of opera- 
tors rather than a specific operator. This is particularly 
relevant in quantum information where one is interested 
in the optimal evolution of a prim ary system alone, ir- 
respective of its environment [42j, or in the entangling 
content of two-qubit gates [28| . 

Our numerical examples illustrate that an analytical 
estimate of the algorithm parameters ensures monotonic 
convergence by taking all worst-case scenarios for opti- 
mization into account. However, if worst-case scenarios 
are not encountered, the analytical estimate imposes lim- 
its which are too severe, slowing down convergence. Es- 
timating the algorithm parameters numerically based on 
the optimization history turns out to be a more efficient 
choice. The numerical estimate of the second order pa- 
rameters can also be employed to speed up convergence, 
in particular close to the optimum where the first order 
contribution vanishes, for optimization problems where a 
second order construction is not strictly required. 

Note that the overall performance of the algorithm 
still depends on the the weights of the each term in 
the optimization functional and the optimization time T. 
The latter has little influence on the convergence once it 
is larger than the quantum speed limit [431 ] - The role 
of the weights is more subtle, in particular for multi- 



optimization problems [44j. A special role is taken by 
the weight, A a , of the term minimizing the field inten- 
sity: It determines the magnitude of the first order con- 
tribution to the new field analogously to the step size in 
gradient-type algorithms (23j . Since its modulus, |A |, 
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is a free parameter of the algorithm, its choice may be 
used to further improve the convergence speed. An effi- 
cient optimization method is obtained by choosing |A a | 
based on information from the second order derivative 
of the functional with respect to the field, estimated 
with the Broyden-Fletcher-Goldfarb-Shanno (BFGS) al- 
gorithm j45|. Note that in terms of convergence speed 
this goes beyond our approach which only makes use of 
information from the second order derivatives with re- 
spect to the states. 

The work presented here opens up a whole range of 
new applications for quantum optimal control. It pro- 
vides a general set of tools to study optimization of final- 
time functionals that are higher order polynomials in the 
states [28|, or optimization of time-dependent expecta- 
tion values that were suggested for the control of high- 
harmonic emission [32j, or optimization for non- linear 
equations of motion such as the Gross-Pitaevski equa- 
tion, time-dependent Hartree-Fock equations or time- 
dependent density functional theory [3, [l!| H3] . This set 
of tools allows for designing novel optimization function- 
als that capture the relevant physics without restriction 
to bilinear functionals. 
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Appendix A: Proof of optimality of the second order 
ansatz 

Konnov and Krotov base their proof upon the following 
condition set 21 [29[: 

1. The right-hand side of the equation of motion, 
/ (tp, e, t), is bounded, i.e. 3 K, L < oo : V (tp, e, t) € 
! 2 ™xEx [0,T],||ai|| > M=> f(p,t,t) < L-\\x\\. 

2. The Jacobian of the equation of motion is bounded, 
i.e. 3 A < oo : V (<p, e, t) R 2NM x Ex e [0,T] : 
P\\<A. 

3. The functionals Jt(0) and g(tp,e,t) are twice 
differcntiable and bounded, i.e. 3 K,L < 
oo : V<p e M. 2NM ,\\p\\ > L =}► J T (p) < 
K\\0\\ 2 and \g(p,e,t)\ < K\\p\\ 2 V (i, e) e [0,T] x 
E. 



Here, the real state vector tp(t) is a piecewise differen- 
tiable function for all t <G [0, T] and the control e is 
an element of the Banach space of continous real val- 
ued functions on the interval [0, T] with suprcmum norm 
||/ (x) Hoc = sup x6[0iT] i.e. e e (C[0,T] , || ■ |U) 

with || e || oo < E < oo. In quantum control, the com- 
ponents of tp(t) are the real and imaginary parts of the 



projections of N Hilbcrt space vectors on a suitable or- 
thonormal basis of the M -dimensional Hilbert space. The 
norm of the Jacobian is then the matrix norm of the 
2NM x 2NM matrix, either column or row norm. 

Based on 21, Konnov and Krotov proved the following 
theorem 0: 

Theorem 1 If the conditions 21 hold, then for each pro- 
cess (<fi^ (t), (t)) , there exists a solution for <!> to 
the extremization problem, 

RftQft),*® = mmij(V(i),e«(^) 

Vt€[0,T], (Al) 
G (T); $) = max G (<p(T)) , (A2) 

of the form 

*(6 7C, t) = x(t)- p{t) + \&(p{t) ■ ±{t) ■ Ap(t) , 
and the matrix function can be represented as 



t(t) = [a (« 



7 (T-t) 



-1 +p).g= t r(t)-M : 



where a, (3 < and 7 > are constants. 

With this theorem, the crucial part of constructing the 
algorithm, i.e. determination of a(t) and thus is re- 
duced to the determination of the constants a, ft and 7. 
Since the proof of Theorem [T] shows how to estimate the 
values of a, f3 and 7, we will sketch it here. It is based on 
the following Lemma which indicates why the conditions 
2t need to be imposed. 

Lemma 1 Let the function h (0) : M" — > M satisfy the 
following conditions: h € C (R") with C (R") denoting 
the space of continuous functions overW 1 , h is twice dif- 
fercntiable at with h = and V^/i (v?)|,^_g = and 
3K,L < 00 : \\<p\\ >L^h{(p)< K\\p\\ 2 . Then: 



sup 



h(0) 

Mil 2 



< 00 



When using Lemma [TJ two problems may arise in en- 
suring that the supremum is finite, (i) Small values of 
||<^|| I create a small denominator which may lead to large 
values of near Cp — 0. This is eliminated by the 

conditions of Lemma [TJ h ( 0) =0 and V^/i (¥>)U_g = 0. 

We show in Appendices IA II and IA 21 that all quantities 
taking the role oih{<p) satisfy these conditions so that we 
can employ Lemma [JJ (ii) Large values of \\p\\ may cause 
large values of h (jp) which in turn can lead to large val- 
ues of jj^p- as \\p\\ —> 00. The second problem is handled 
by imposing that h (tp) grows at most quadratically with 
\\tp\\ which is guaranteed by the conditions 21. Continuity 
of h(<p) then ensures finiteness of the supremum's argu- 
ment for all intermediate values of \\p\\, < \\p\\ < 00. 
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1. Final-time contribution to the second order 
ansatz — proof of existence of finite A 

Without any assumption on the normalization, i.e. 
without any reference to quantum control, we verify that 
A of Eq. (fT!)|) is well-defined with A < oo based on the 
conditions 21 and using Lemma [TJ Let ip denote the 
change in the state vector. Specifically, we have to check 
that, for 



the desired result, 



with Af defined in Eq. (TT5)) . (i) h is twice diffcrcntiablc 
at with h (fj) = 0, (ii) Vjh (fy g = and (iii) 

h (^ is bounded, i.e. finite constants K and L exist 
such that 3 K,L < oo : ||$| > L => h (^ < K\\%f\\ 2 . 

Obviously h is twice differentiable since / is twice diffcr- 
cntiablc and x(T) is a constant with respect to ip. Since 

J T (V (l) + *f) . -Jt = Jt (<f {l) ) - Jt = 

0, we find h (oj = 0. We check whether V (^ 
0, 



■0=0 



V>=o 




■0=0 




Since we have to check these conditions at t 
remembering that x(T) 



7 and 

W (T)), we obtain 



i/>=0,t=T 



X ' e ,/7 



(0(0 (T)) 



e ^(T) 



= 0. 

Finally, the condition set 21 tells us that Jt (<p) "*"= 
O (llV-ll 2 ), hence 

h (jPj = (x-Pj+Jr + - Jt 

im =°° o(uw)+o(ur)+o(i) 



and condition (iii) is fulfilled. We may thus use Lemma Q] 
which guarantees the existence of A. 
2. Intermediate-time contribution to the second 
order ansatz — proof of existence of finite B, C 

Analogously to the previous section, we now verify 
that B and C are well-defined with < B < oo and 
C > — oo without any assumption on the normalization 
of the state vector. The constant B defined in Eq. (j2"0|) 
is easily checked using Lemma \T\ Similarly to proving 
the existence of finite A, we have to check whether the 
conditions for Lemma [T] are fulfilled for 



(#)) = m ■ Af 



In analogy to Appendix IA 11 it is obvious that h is twice 
differentiable at with h I ) = 0. In order to check that 



i>(t)=o 



0, we use the product rule, 



V^(^(*) 



V(t)=0 



»/>(t)=0 



0. 



=0 



Finally, the condition set 21 tells us that ||/ {0) \\ = 
0(||$|), hence 

h(J)=j.Af m ^0(\m*), 

i.e. h is indeed quadratically bounded. We may thus 
use Lemma Q] which guarantees the existence of B < oo. 
Moreover, B > due to taking the square modulus inside 
the supremum. 



To prove the existence of finite C defined by Eq. (|22[) , 
we distinguish the two cases for large and small ||^>||. For 
large H'i/'MII; the argument of the infimum is finite. This 
follows from \{t) and x(t) being constant with respect 
to tp(t) and the fact that for large \\0\\, we have ||/|| < 
ATiH^H and \g\ < 7f 2 |M| 2 - Hence 

II -i/TC*) II — / _ \ 

x(f)-#)+xW-A/-A 9 < o(||^)|| 2 ). 
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For small H^WII we have to check that the zeroth and the expression for \ into the definition of C yields 
first order terms in the denominator disappear. Inserting 





■ $(t) + x(t) ■ A/ + V^g ■ $(t) - Ag 







C= inf '- -j- ^-^ , (A3) 

iJ(t)£R 2NM ;t£[Q,T] 



I 

For sufficiently small e and ||^(£)|| < £, we may approxi- This yields 
mate A/ and Ag to the first order, 



aj c v m f-m+o(\\n 2 ) 

Ag ~ V^ )5 -^) + 0(||^l| 2 ) 



(-v^ (t) / T • m) ■ m + v m9 ■ m + m • a/- a 9 











- V 0{t) g ■ iit) + O 


[wmw 2 ) 


(m ■ m) 





= 0(1) , 



i.e. C remains finite also for small ||^(i)||. 

Appendix B: Applying Krotov's proof to quantum 
control problems 

We adapt Konnov and Krotov's results [29( to the case 
that tp is composed of the M real and imaginary expan- 
sion coefficients of TV normalized quantum state vectors. 
In quantum control applications, some of the conditions 
21, cf. Appendix [Aj are trivially fulfilled. Specifically, 
the state vector tp is inherently bounded for any field e 
if tp is obtained from the equation of motion for a given 
e due to normalization. In particular, \\tp\\ cannot be- 
come larger than s/N and we may reduce the candi- 
date space M. 2NM for tp to a compact subset of M. 2NM , 

namely X = {V € R 2NM \ \\tp\\ < y/lS[} C R 2NM . The 

conditions concerning the behavior for large \\tp\\ are then 
trivially fulfilled because the complete space of interest 
[0, T] x X x E, that is the set of all {(t, tp, e)}, is compact. 
The boundedncss of the right-hand side of the equations 



of motion, the Jacobian and the functionals is then guar- 
anteed by simply asking / (t, tp, e), J, Jt(<P) and g (e, 0, t) 
to be regular. 

Due to the restriction of the states to the compact sub- 
set X C M. 2NM , the proof simplifies for quantum control 
applications which we use to obtain straightforward esti- 
mates of the constants a, (3 and 7. The changes in R and 
G due to variation of the state from the extremal point, 
tp^ 1 ', to all possible states, tp^ + tp, is measured by 

AG (V) = G (V (T) + $)-G (ft (T)) (31) 

-R(f*(t),e®(t),t) . (B2) 

Then the global cxtrcmum conditions, Eqs. (|A1I) - (|A2[) . 
correspond to AG (ip\ < and AR\ip(t),t\ > 0. In 

quantum control applications, any state tp^' + ip that 
is candidate for tp^ +1 ^ must also be normalized. Geo- 
metrically, all states <p^ and <p^ + ip lie therefore on a 
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0@ + 02 




FIG. 6: All admissible states, <^ (i) and <p (i+1) = <^ (i) lie on 



a sphere of radius X = y N around the origin. Therefore the 
norm of the change in the states states, ||^||, varies between 
and IX. 



sphere of radius X. The norm of the vectors ip varies 
between zero and 2X since for any two vectors 0^', 
(p\ i + 1 ) i n X the minimum distance is zero while the max- 
imum distance is 2X. This is illustrated in Fig. [HI The 
space for the state change vectors ip is then given by 



Y 



{v7 



p2NM 



p2NM 



30^ G X : 0® + $ E X j C 

In quantum control applications it is thus sufficient to 
vary the vector <^* +1 ) over the sphere of radius X around 
the origin instead of the full M. 2NM . 

With the definitions of A, B and C, cf. Eqs. (JTSJl, ([20]), 
(|22[) . a strict maximum condition for G and minimum 
condition for R is transformed into 

cr(T) < -2A, (B3) 

i<r(*)-|<r(t)|.fl + C > 0. (B4) 

A solution a(i) fulfilling these inequalities exists and it is 
straightforward to check that the Konnov and Krotov's 
ansatz Eq. (fT5|) satisfies (|B3[) and (|B4|) . More generally, 
the ansatz I29| 



a(i) = a(e^ T -*)-l) + 



/3 



with a, (3 < and 7 > leads to the inequalities 



~2~ 



/3 + 2A < 0, 
B/3 + C > 0, 



(B5a) 
(B5b) 



which have at least one specific solution, namely Eq. (|13l) , 
with 



C 

a = — - A , fl 



-A , 7 = 5. 



Note that if a set {ao, 0o, 70} fulfills the inequalities (|B5I) . 
then any other set {a, /?, 7} with a < ari < 0, (3 < f3 < 0, 
7 > 7o > does so, too. This flexibility can be uti- 
lized to estimate the constants A, B and C. Therefore 
the suprema in Eqs. (|15[) and (|20j) and the infimum in 
Eq. (f2"2"]l can be estimated analytically. 



Appendix C: Analytical estimate of the parameters 
of the second order contribution 



The arguments of the suprema, respectively of the in- 
fimum, in A, B and C can be expressed in terms of Tay- 
lor series starting at the first, respectively second, order. 
Evaluating the remainder term of these Taylor series, we 
obtain estimates for A, B and C. Let W (0) be a scalar, 
vector or matrix depending on 0. W n {0) denotes the 
Taylor expansion of W around 0^ starting at the n-th 
order or, in other words, W n (0) equals W (0) minus the 
first n — 1 terms of its Taylor expansion around 0^' . For 
example, we obtain for a scalar field 

W (0) = W , 

Wi{0) = W{0) - w (V l) ) , 

W 2 {0) = W{0)-W{(p {i)S \ 

(f*>) -{0-0 



<<) 



and so forth. The Taylor series starting at the n-th order 
can be approximated by evaluating the remainder term. 
For a scalar field W (0) this is given by 



w n {0) = n^^) = 



\a\—n 



for a c € (0, 1) and with ip = 0— <pW . Here a is a multi- 
index representing the 27VM-tuple of natural numbers 
including zero, 



2NM 



a * ' ~~i 



1 



2NM 



2NM 



and 
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2NAI 



,r=][ €- , (C2) 



(d a w) = n 



d ai W(0) 



dtp? 



(C3) 



The remainder can be estimated by 



j) <^AC (0^)r ,\a\=n, (C4) 



with 



sup d a W {ip 

ift€_Y; \ot\— n 



An estimate that is independent of the state <p\ l > is ob- 
tained by taking the supremum over all possible 0^ , i.e. 
we define 



M, 



sup M„ (0 (i A 

d a W (V (i) 
d a W . 



sup 

tpW eX;-!?e¥;|a|=n 



sup 

*eX;|a|=n 



(C5) 
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This method to estimate the Lagrange remainder term 
lends itself to an intuitive geometrical interpretation, cf. 
Fig. [6] For given <p^> , the Taylor expansion of W around 
(p^ % > starting at the nth order, W n , can be estimated 
by the supremum of the nth derivatives taken over the 
sphere around this state with radius = yo. To give 
an estimate of W n that holds for any (f^ l \ we have to 
calculate the supremum around all possible state vec- 
tors. Since all <p^ are located on a sphere around the 
origin with radius y/N, we simply need to take the supre- 
mum of the nth derivatives over all vectors within a ball 
around the origin with radius 2y/~N. Since the Lagrange 
form of the remainder is based on the mean value the- 
orem, the difference between two values of a function 

W (0^^ + ipj — W (0^) is estimated by the first deriva- 
tive of the function at some point between (p^ + ip and 



0W. Since both 0^> and (p^> 4- -0 arc located on the 
sphere of radius yf~N around the origin, any difference 
of a function between these two points can only concern 
values of derivatives inside this sphere. Therefore we can 
restrict the supremum to be taken over all states in X in 
the last line of Eq. ([C5]) . 

We now apply the estimate of the Lagrange remain- 
der to derive the constants A, B and C and use 
the braket notation in the following. Considering in 

Eq. (JTSJ) the Taylor expansion of J T {Wk\ T ) + A< M) 

in Ap k around (p^(T), the first order term cancels with 
(Xk(T)\Aipk{T)) + c.c. since \xk{T)) is given in terms of 
the gradient of Jt, cf. Eq. (|9b[) . The zeroth order term 

is nullified by -J T ({ipf{T)\ 



A . su MWf{T) + A<p k (T)}) - J T ({<P k l) (T)}) + Ef =1 [(Xk(T)\A<p k (T)) + (Ay fc (r)| Xfc (T))] 
{A U ^} Ef=i(A^(T)|A^(T)) 

' {A Vk }Eti(^MT)\Ap k (T)) ■ 



The argument of the supremum in the definition of A, 
Eq. (TT5)) . can therefore be viewed as the Taylor expansion 

of J T ({^(T) + A(p k }j around the tp k ] {T) starting at 

the second order, divided by Ylk=i (A^fcC? 1 ) \ Aip k (T)}. 
It is now possible to estimate Jt,i by its Lagrange re- 
mainder according to Eq. (|C4|) . 



-4 



sup 



Jt 



({A^ fc (T)}) 



{A^}£Li< A ^( r )l A ^( r )> 

\Mf (rf(T)}) Ef =1 (A^(T) I A<p k {T)) 



< sup 

{A<?k} 



Ef =1 (A^(T)|A^(T)) 
\MZ-({^{T)}) 

\ sup d a J T ({A^ k {T)}), 

1 {A<p k };|ot|=2 



yielding Eq. ([To]) . Note that for functionals Jt that are 
quadratic in the states, the global convergence condition 
(|B3p coincides with the local one, V^G < 0, that was 
used in Ref . [l6| . This can be seen by inserting the ansatz 
for $, Eq. ©, into VlG < 0, 



J T (V l) CO)| +a(T) < 0. 



To find an expression for the constant B defined by 
Eq. pp|) . we rewrite the change Af k in the equations 
of motion due to changes in the states, cf. Eq. (|18|) . 
in terms of the Taylor expansion of the Hamiltonian in 
Atp k (t) around the p k \t) starting at the first order, Hi, 
and obtain 



B < sup 

{A Vk };te[0,T] 



Eti (AfflkCf) Htfrk + Aip^eW) <p%'(t)) + {tp k (t) H[(<p k +Atp k ,eW) Apf \t) 



+2 sup 

{A Vk };te[0,T] 



£f =1 3m(A^) J H (^\t) + A<p k (t),e®,t) \ A Vk {t) 



Ef=i[(A^(t)|A^(t))] 

Using the Cauchy Schwarz inequality for the scalar products in the argument of the first supremum and Eq. (|C5P for 
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the second supremum yields 



B < 



sup 



{A Vk };k[0,T} \ J2k=l ( A Vfe(0 | A(p k (t)) fr[ 



N r 



y/(Atp k (t)\A<p k (t)) ■ 



Hi [<p k + Aip k ,e 



+2 sup 



E£=i M A^ (t) H (A^ fc (t) , e W , t) A<p k (t) 



Efc=i(A^(t)|A Vfe (t)) 



To evaluate the first supremum, we estimate the Taylor 
expansion of the Hamiltonian starting at the first order, 
Hi, by its Lagrange remainder, 



N 



*j=i 

2V 

= ]T sup <9 Q H (A^jfc) 

fc=l A Vfcil«l =1 

■y/(Atp k \Aip k ) ., 



(C6) 



Note that the absolute value of the derivatives is re- 
quired in the estimation of the Lagrange remainder 
since we need to estimate the norm of Hi. With 

££Li V^F(*rf(*)) = V^V, we obtain Eq. CD for S. 
In particular, for Hamiltonians that do not depend on the 
state, H(4 i) (t)+A^(t),eW) - A(^(t), c W) = 0, 
and the first supremum can taken to be zero. Then B 



is given by the second supremum alone which is twice 
the maximum absolute value of the imaginary part the 
Hamiltonian's eigenvalues. That is, the second supre- 
mum is non-zero only for non-unitary time evolution. In 
summary, for linear equations of motion and unitary time 
evolution, B = 0. The differential inequality system for 
a(t) then reduces to 



1 



a(T)+A < 0. 



-&(t)+C > 0. 



such that we obtain a linear solution for a(t), 

a(t) = C(T — t) — A. (C7) 



To estimate the constant C defined by Eq. 
rewrite it, using Eq. (|A3|) . 



we 



C = inf 

{A Vk };te[0,T] 



Eti [{xP® |/M +A^,e«)} + (/ M +A^,e«) |x^(*)>] -.92 (rf +A^}^ 



Eti (AMt)\^ k (t)) 



(C8) 

where l/fc.2) and <?2 are the Taylor expansions of \f k ) and g in Atp k (t) around tp k (t) starting at the second 
order. Expressing \f k ^) m terms of the Taylor expansion of the Hamiltonian starting at first order, |/fc,2) = 

Hi (<p k + Aip k ,e^j \ Aip k (t)), and introducing the approximation 



-C > sup 

{A Vfc };tG[0,T] 



sup 



Ef=i [(xi z) (t) I Hi fa® + A Vk , e«) J A Vfc (i)) + (A^(i) | H+ + A 



Ef = i <A^(t)|A^(t)) 



.92 (rf + Acp k }, 



{A Vk };k[0,T] Ef =1 (A^ fe (t) I Ap fc (t)} 



we may reuse our results for Hi, cf. Eq. (|C6[) . together ogously to that of Jt2> with Mf given by 



-M x ' to estimate the first term. The 



with M 

estimation of the second term involving 92 proceeds anal- 



Ml 



sup d a g({A<p k },t) 

{A Vk };te[0,T] 
\a\=2 



We thus obtain Eq. ([231) for C. For Hamiltonians that 
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depend on the state, the first term in the right-hand side 
of Eq. ([25)1 is non-zero. Note that the norm of all adjoint 

(i) 1 — 

vectors x k ^ s ec i ua l to yJV only for state-independent 

(i) 

constraint g. For state-dependent constraint g, x\ lii ^ ne 
solution of an inhomogencous Schrodingcr equation and 
its norm may be smaller or larger than y/~N . However, 

(i) 

the norm of all adjoint vectors xh ^ s always known since 



the x^k are calculated in the previous iteration step, i. 
So in order to estimate the first term of the right-hand 



side of Eq. ([25f , we only need to determine 
is also needed for estimating B. In this case 



<9H 



which 



-C is then 



given by the sum of the spectral radius of the operator 



<9H 



and the eigenvalue of D(£) with largest magnitude. 
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